OFRewind: Enabling Record and Replay Troubleshooting for Networks

Authors

  • Andreas Wundsam
  • Dan Levin
  • Srini Seetharaman
  • Anja Feldmann
Abstract

Debugging operational networks can be a daunting task due to their size, their distributed state, and the presence of black-box components such as commercial routers and switches, which are poorly instrumentable and only coarsely configurable. The debugging tool set available to administrators is limited and provides only aggregated statistics (SNMP), sampled data (NetFlow/sFlow), or local measurements on single hosts (tcpdump). In this paper, we leverage split forwarding architectures such as OpenFlow to add record and replay debugging capabilities to networks, a powerful yet currently lacking approach. We present the design of OFRewind, which enables scalable, multi-granularity, temporally consistent recording and coordinated replay in a network, with fine-grained, dynamic, centrally orchestrated control over both record and replay. Thus, OFRewind helps operators reproduce software errors, identify data-path limitations, or locate configuration errors.

Similar resources

CERIAS Tech Report 2015-12 Software and Hardware Approaches for Record and Replay of Wireless Sensor Networks

Tan Creti, Matthew Edward Ph.D., Purdue University, August 2015. Software and Hardware Approaches for Record and Replay of Wireless Sensor Networks. Major Professor: Saurabh Bagchi. Wireless Sensor Networks (WSNs) are used in a wide variety of applications including environmental monitoring, electrical grids, and manufacturing plants. WSNs are plagued by the possibility of bugs manifesting only...

Processor-Oblivious Record and Replay

Record-and-replay systems are useful tools for debugging non-deterministic parallel programs by first recording an execution and then replaying that execution to produce the same access pattern. Existing record-and-replay systems generally target thread-based execution models, and record the behaviors and interleavings of individual threads. Dynamic multithreaded languages and libraries, such a...

DCR: Replay-Debugging for the Datacenter

We’ve built a tool for debugging non-deterministic failures in production datacenter applications. Our system, called DCR, is the first to efficiently record and replay large scale, distributed, and data-intensive systems such as HDFS/GFS, HBase/Bigtable, and Hadoop/MapReduce. The enabling idea behind DCR is that debugging doesn’t require a precise replica of the original datacenter run. Instea...

Record/Play in the Presence of Benign Data Races

In this article we present our experience with the integration of record/replay in the Jalapeño virtual machine. The goal of record/replay is to be able to faithfully replay an application. Previous work in Jalapeño focused on the replay of Java applications on uni-processors. Here we describe additional work done to obtain replay with low intrusion on multi-processor systems by doing `orde...

End-User Record and Replay for the Web

Publication date: 2011